3 research outputs found

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD Thesis
    In the age of the internet, email, and social media, there is an increasing need to process online information, for example to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest, as reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is at least 90% accurate, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.
    Egyptian Government

    Muslim preachers’ pandemics related discourses within social media: A corpus-based critical discourse analysis

    Abstract
    Pandemics have been extensively represented in different discourse genres, including journalistic, media, medical, social media, and academic discourse. This study explores the representation of COVID-19, swine flu, and monkeypox in Arab Muslim preachers’ discourses on Twitter and Facebook. Muslim preachers’ discourses remain among the influential discourses that inform the ideology of their believers, as they are largely based on the Islamic authoritative discourses of the Quran and the Hadiths of Prophet Muhammad. The data set of 538 postings was generated through extended observation of purposively recruited Arab Muslim non-mainstream scholars’ postings on Facebook and Twitter from March 2019 to August 2022. The data were analyzed using corpus-based critical discourse analysis. The twofold analytical lens, combining corpus linguistics (CL) and critical discourse analysis (CDA), revealed that Muslim preachers frequently used ideological semantic patterns in communicating with Muslim society at large about the pandemics. These semantic patterns emerged as embedded in certain ideological frames established in the Islamic authoritative discourses of the Quran and the Hadiths of Prophet Muhammad. In their ideological representation of the pandemics, Muslim preachers framed all three pandemics mostly as the wrath of God. Religious scholars’ postings cannot be considered an account of teaching and preaching; rather, they merely consume and produce Islamic ideology in a way that manipulates and influences Muslims’ knowledge of existing reality by adding new meanings in line with the chosen ideological frames.

    Exploring the efficacy and reliability of automatic text summarisation systems: Arabic texts in focus

    Abstract
    This study compared the salient features of the three basic types of automatic text summarisation methods (ATSMs), namely extractive, abstractive, and real-time, along with the available approaches used for each type. The data set comprised 12 reports on current issues in automatic text summarisation methods and techniques across languages, with a special focus on Arabic, whose structure has widely been claimed to be problematic for most ATSMs. Three main summarisers were compared: TAAM, OTExtSum, and OntoRealSumm. In addition, a human-written summary of the data set was prepared and then compared with the automatically generated summary. A 10-item questionnaire was built to help with the assessment of the target ATSMs. ROUGE analysis was also performed to assess the efficacy of all techniques in minimising the redundancy of the data set. Findings showed that the precision of the target summarisers differed considerably, as 80% of the data set was found to be aware of the problems underlying ATSMs. The remaining parameters were in the normal range (65–75%). In light of the equations-based assessment of ATSMs, the highest range was noted with stop-word removal, and the lowest with POS tagging, stem weight, and stem collection. Regarding Arabic, statistical analysis proved to be the most effective summarisation method (accuracy = 57.59%; recall = 58.79%; F-value = 57.99%). Further research is required to explore how the lexicogrammatical nature of languages and generic text structure would affect the text summarisation process.